Shangtong Zhang

Convergence of Two-Timescale Markovian Stochastic Approximations with Applications in Reinforcement Learning

May 29, 2026
Via arXiv

Latent Q-Barrier Shielding for Safe In-Context Reinforcement Learning

May 24, 2026
Via arXiv

Adaptive Policy Selection and Fine-Tuning under Interaction Budgets for Offline-to-Online Reinforcement Learning

May 06, 2026
Via arXiv

Almost Sure Convergence of Differential Temporal Difference Learning for Average Reward Markov Decision Processes

Feb 18, 2026
Via arXiv

MathlibLemma: Folklore Lemma Generation and Benchmark for Formal Mathematics

Jan 30, 2026
Via arXiv

Multi-agent DRL-based Lane Change Decision Model for Cooperative Planning in Mixed Traffic

Jan 16, 2026
Via arXiv

Prompt-Driven Domain Adaptation for End-to-End Autonomous Driving via In-Context RL

Nov 16, 2025
Via arXiv

Towards Formalizing Reinforcement Learning Theory

Nov 05, 2025
Via arXiv

Extensions of Robbins-Siegmund Theorem with Applications in Reinforcement Learning

Sep 30, 2025
Via arXiv

Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features

May 27, 2025
Via arXiv